🔀 SIMD Programming - hello · Scour

🛣️Highway zeux.io·

Zigzag decoding with AVX-512

Covers uops.info

Discussed on Hacker News

🛣️Highway Phoronix·

Rust PNG Image Decoder Now Even Faster: Benefiting Chrome, GNOME, Etc

⚡SIMD Optimization shnatsel.medium.com·

Safe SIMD in Rust, even on the inside

Discussed on Hacker News and Lobsters

🛣️Highway Cryptology ePrint Archive·

Parameter-Aware and Instruction-Driven Dilithium Optimization on AVX2 and NEON

🔢Pgvector arxiv.org·

MonaVec: A Training-Free Embedded Vector Search Kernel for Edge and Offline AI Systems

Covers Easy way to do both: async <-> sync (crates.io dump loading and parsing example)

⚡Hardware Acceleration Akin Ocal·

Building a High-Throughput FIX Server

Discussed on Substack

⚡Hardware Acceleration indianspeedster.github.io·

Occupancy Math on the AMD MI355X: A From-First-Principles Guide

Discussed on Hacker News

Less-relevant results

⚡RISC-V atticarun.itch.io·

Foundry-5: browser puzzle game that teaches you real RISC-V assembly

Discussed on Hacker News

🛣️Highway blog.image-rs.org·

Rust PNG crate gets even faster, used by GNOME and Chromium

Covers google/oss-fuzz

Covered by Phoronix

Discussed on Hacker News

🔨Compilers fil-c.org·

Memory Safe Inline Assembly

Discussed on Hacker News

🛣️Highway LXer Linux News·

Revised AVX-512 xor_gen() Implementation For Linux RAID Yielding More Performance Gains

🔍RAG DEV Community·

Speed, Accuracy, and Efficiency: Benchmarking Endee vs. Google Vertex AI

Discussed on DEV

🦙Ollama GitHub·

Building a CPU LLM engine in C99 - stuck at 1.90 tok/s on DeepSeek MoE while llama.cpp does 13.79. Potential root cause identified. Implementation is not.

Discussed on r/LocalLLaMA

🔧LLVM IR Optimization hiraditya.github.io·

Loop Unrolling in the ML Era

Discussed on Hacker News

⚡Hardware Acceleration medium.com·

Safe SIMD in Rust, even on the inside

🏗️Build Systems GitHub·

RunEdgeAI/turboquant.cpp: Near-optimal online vector quantization in C++23 — 1-4 bits per coordinate, no training, no codebooks

Covers TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate

Discussed on Hacker News

🔢Sparse Matrices arxiv.org·

Evaluating Rust for Sparse Matrix Kernels in Scientific Computing

⚛️Quantum Computing arxiv.org·

Diagonal-Budgeted Trotterization for Efficient Quantum Hamiltonian Simulation

🔧HPCToolkit arxiv.org·

Is RISC-V Ready for Massively Parallel Astrophysical Codes?

👁️Computer Vision arxiv.org·

Experimental Analysis of Neural Network-Based Image Classification on the CIFAR-10 Dataset
